DETAILED ACTION

This office action corresponds to application 10/749,518 filed 1/2/2004. 

Claims 23-48 have been examined and are pending prosecution. Response to arguments can be
found on page 11 of this document.

Response to Amendment 
The Examiner acknowledges the amendments. Claims 1-22 are cancelled and are replaced by 
new claims 23-48. These claims have been entered. 

Drawings 

With respect to the Applicant's amendments, the objections to the drawings are 
overcome. Accordingly, these objections are withdrawn. 

Claim Objections 

In light of the new claims, the previous claim objections have been withdrawn. However, the 
Examiner has objected to the following claims: 

As per claims 23 and 36, the Examiner asks the Applicant to add a colon (:) at the end of
line 16 of claim 23 and at the end of line 12 of claim 36.

As per claims 24-25, 37, and 38, the Examiner asks the Applicant to change
"mis-characterizing" to "mis-classifying" so as to be consistent with the specification.

Claim Rejections - 35 USC § 112

The following is a quotation of the second paragraph of 35 U.S.C. 112:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the 
subject matter which the applicant regards as his invention. 

Regarding claim 23, the phrase "substantially lower" renders the claim indefinite. 
Appropriate correction is required. 

Claims 30, 33, 37, 43, and 46 recite the limitation "said alarm criterion" in lines 5 or 6 of 
these claims. There is insufficient antecedent basis for this limitation in the claim. Furthermore, 
there is no support for alarm criterion in the specification. 

Claim 28 recites the limitation "said machine-readable code" in line 1. There is 
insufficient antecedent basis for this limitation in the claim. 

Claim Rejections - 35 USC § 102 
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the 
basis for the rejections under this section made in this Office action: 
A person shall be entitled to a patent unless - 

(e) the invention was described in (1) an application for patent, published under section 122(b), by another filed 
in the United States before the invention by the applicant for patent or (2) a patent granted on an application for 
patent by another filed in the United States before the invention by the applicant for patent, except that an 
international application filed under the treaty defined in section 351(a) shall have the effects for purposes of this 
subsection of an application filed in the United States only if the international application designated the United 
States and was published under Article 21(2) of such treaty in the English language. 

Claims 23, 26, 28-31, 33, 34, 36, 39, 41-44, 46, and 47 are rejected under 35
U.S.C. 102(e) as being anticipated by Yamanishi et al. ('Yamanishi' hereinafter).

With respect to claim 23, Yamanishi teaches a method of outlier detection comprising:

generating a plurality of synthesized data (V data set; paragraphs 0064-0066), each 
representing a state (degree of outlier) within a given vector space (paragraphs 0061, 0064 and 
0066-0068), said generating including a random number generation (paragraph 0066; therein the 
equation expresses a random number); 

receiving a plurality of real sample data (data set x), each representing a detected real 
event as represented in said given vector space (paragraph 0061); 

forming a candidate sample set (data set z; paragraph 0066) comprising a union of at least 
a part of said plurality of synthesized data (V data set) and said plurality of real sample data (X 
data set), said candidate sample set having a starting population (paragraph 0066), said candidate 
sample set being unsupervised as to which members will be classified by said method as being 
outliers as the labeling denoted in the z set (paragraph 0066); 

generating a set of classifiers (t(i); paragraph 0066), each member of said set being a
procedure or a representation for a function classifying an operand data as an outlier or a
non-outlier, as t(i) denotes a one-to-one function (paragraph 0066);

initializing said set of classifiers to be an empty set (latter part of 0066);

selectively sampling (sampling unit 23, paragraph 0087) said candidate sample data to 
form a learning data set (training data set W; paragraphs 0066-0069), said selectively sampling 
including 

i) applying said set of classifiers (label indicating abnormality; paragraph 0087) to
each of said candidate sample data and, if any classifiers are extant in said set, generating
a corresponding set of classification results (as generating new rules for characterizing
data; paragraph 0088), and

ii) calculating a sampling probability value for each of said candidate sample data 
based, at least in part, on the corresponding set of classification results (as calculating a 
degree of outlier), and 

iii) sampling from said candidate data to form said learning data set based, at least 
in part, on said sampling probability values (as sampling unit 23 samples each data based 
on the calculated degree of outlier; paragraph 0087), 

such that said learning data set has a population (paragraph 0066) substantially 
lower than the starting population (filtering unit 21), 

generating another classifier based on said learning data set, 
updating said set of classifiers to include said another classifier, 

and 

repeating said selectively sampling, said generating another classifier, and said updating
until said set of classifiers includes at least t members (the above three steps, i.e.,
generating another classifier, updating said set of classifiers, and repeating, are disclosed
in Yamanishi at paragraph 0019, wherein the system can be repeatedly executed); and

generating an outlier detection algorithm based, at least in part, on at least one of said
another classifiers, for classifying a datum as being an outlier or a non-outlier as rules for
determining outliers. These rules determine the unfairness (abnormal or not) of data (paragraph
0014 and figure 11, filtering unit 51).
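
For orientation only, the sequence of steps recited in claim 23 can be sketched as below. This is an illustrative sketch and is not taken from the application or from Yamanishi. The k-nearest-neighbour learner, the labelling of real data as 0 and synthesized data as 1, the disagreement-based sampling probability, the final majority vote, and every function and parameter name are assumptions made for the example.

```python
import random

def train_knn(learning, k=3):
    """Return a classifier closure: 1 = outlier, 0 = non-outlier (assumed labels)."""
    def classify(vec):
        nearest = sorted(
            learning,
            key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], vec)),
        )[:k]
        return 1 if 2 * sum(label for _, label in nearest) > len(nearest) else 0
    return classify

def sampling_probability(votes, base_rate=0.3):
    """Assumed rule: sample more often where the current classifiers disagree."""
    if not votes:                       # empty classifier set: uniform sampling
        return base_rate
    margin = abs(2 * sum(votes) - len(votes)) / len(votes)
    return base_rate * max(0.1, 1.0 - margin)

def build_detector(real, t=5, seed=0):
    rng = random.Random(seed)
    dim = len(real[0])
    low = min(min(v) for v in real) - 1.0
    high = max(max(v) for v in real) + 1.0
    # Synthesized data drawn with a random number generator.
    synthetic = [[rng.uniform(low, high) for _ in range(dim)] for _ in real]
    # Candidate set: union of real (label 0) and synthesized (label 1) data.
    candidates = [(v, 0) for v in real] + [(v, 1) for v in synthetic]
    classifiers = []                    # the set of classifiers starts empty
    while len(classifiers) < t:         # repeat until at least t members
        learning = []
        for vec, label in candidates:
            votes = [clf(vec) for clf in classifiers]
            if rng.random() < sampling_probability(votes):
                learning.append((vec, label))
        if not learning:                # guard against an empty draw
            learning = [rng.choice(candidates)]
        classifiers.append(train_knn(learning))
    def detect(vec):
        # Assumed aggregation: simple majority vote over all classifiers.
        return 2 * sum(clf(vec) for clf in classifiers) > len(classifiers)
    return detect, classifiers
```

Because the per-candidate probability in this sketch never exceeds `base_rate`, each learning set is expected to hold at most about 30% of the candidate population, which is one way to read "substantially lower than the starting population".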

With respect to claim 26 and similar claims 31, 34, 39, 44, and 47, Yamanishi teaches
calculating a sampling probability value includes identifying a consistency within said set of 
classification results and calculating a probability of said identified consistency using a Binomial 
probability function (paragraphs 0071-0077). 
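
As a reading aid for the "Binomial probability function" language, the following sketch computes the probability of an observed level of agreement among n classification results. It is a hypothetical illustration, not a formula quoted from Yamanishi or from the application. The function names, the encoding of results as 0 and 1, and the default of independent results with probability p = 0.5 are assumptions.

```python
from math import comb

def binomial_pmf(k, n, p=0.5):
    """Probability of exactly k 'outlier' results out of n independent results."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def consistency_probability(votes, p=0.5):
    """Probability that at least as many results agree on 'outlier' as the
    size of the observed majority, under the assumed Binomial model."""
    n = len(votes)
    k = sum(votes)
    majority = max(k, n - k)            # size of the larger agreeing group
    return sum(binomial_pmf(j, n, p) for j in range(majority, n + 1))
```

Under these assumptions, four unanimous results give `consistency_probability([1, 1, 1, 1]) == 0.0625`, while an even split gives `consistency_probability([1, 1, 0, 0]) == 0.6875`, so a small value indicates strong agreement.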

With respect to claim 28 and similar claim 41, Yamanishi teaches generating synthesized 
data generates said synthesized data in accordance with a given statistical likelihood of said 
generated data meeting an outlier criterion (paragraphs 0008 and 0066). 

With respect to claim 29 and similar claim 42, Yamanishi teaches generating an outlier 
detection algorithm generates the outlier detection algorithm such that said algorithm applies an 
aggregate of members of said set of classification algorithms, calculates a corresponding set of 
detection result data representing each of said aggregate's member's classification, and applies a 
voting scheme to said corresponding set of detection result data as a set of rules (paragraphs
0017 and 0074 and figure 11).

With respect to claims 30, 33, 43, and 46, Yamanishi teaches calculating of said 
uncertainty value includes identifying a consistency among said set of classification results, and 
calculating a probability of said consistency, assuming each classification result within said set 
has a 50-50 probability of representing an operand as meeting said alarm criterion, statistically 
independent of said operand and of all other classification results within said set (paragraphs 
0071-0075 and 0077; also threshold).

With respect to claim 36, Yamanishi teaches a system for classifying externally detected
samples as one of at least normal and an outlier, comprising: 

a machine controller having a readable storage medium (paragraph 0140); 

a machine-readable program code, stored on the machine-readable storage medium 
(figures 1 and 3), having instructions to: 

generate a plurality of synthesized data (V data set; paragraphs 0064-0066), each 
representing a state (degree of outlier) within a given vector space (paragraphs 0061, 0064 and 
0066-0068), said generating including a random number generation; 

receive a plurality of real sample data (data set x), each representing an observed event as 
represented in said given vector space (paragraph 0061); 

form a candidate sample set (data set z; paragraph 0066) comprising a union of at least a 
part of said plurality of synthesized data and said plurality of real sample data (X data set), said 
candidate sample set having a starting population; 

generating a set of classifiers (t(i); paragraph 0066), each member of said set being a
procedure or a representation for a function classifying an operand data as an outlier or a
non-outlier, as t(i) denotes a one-to-one function (paragraph 0066);

initializing said set of classifiers to be an empty set (latter part of 0066);

selectively sampling said candidate sample data to form a learning data set (training data 
set W; paragraphs 0066-0069), said selectively sampling including 

i) applying said set of classifiers to each of said candidate sample data and, if any
classifiers are extant in said set, generating a corresponding set of classification results
(as generating new rules for characterizing data; paragraph 0088), and

ii) calculating a sampling probability value for each of said candidate sample data
based, at least in part, on the corresponding set of classification results (as calculating a 
degree of outlier), and 

iii) sampling from said candidate data to form said learning data set based, at least 
in part, on said sampling probability values, such that said learning data set has a 
population substantially lower than the combined first and second population, generating 
another classifier based on said learning data set, updating said set of classifiers to 
include said another classifier, repeating said selectively sampling, said generating 
another classifier, and said updating until said set of classifiers includes at least t 
members (as sampling unit 23 samples each data based on the calculated degree of 
outlier; paragraph 0087); and 

to generate an outlier detection algorithm based, at least in part, on at least
one of said another classifiers, for classifying a datum as being an outlier or a
non-outlier as rules for determining outliers. These rules determine the unfairness (abnormal
or not) of data (paragraph 0014 and figure 11, filtering unit 51).



Claim Rejections - 35 USC § 103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made.

Claims 24, 25, 27, 32, 35, 37, 38, 40, 45, and 48 are rejected under 35 U.S.C. 103(a) as
being unpatentable over Yamanishi as applied to claims 23, 26, 28-31, 33, 34, 36, 39, 41-44, 46,
and 47 above in view of Meek et al. ('Meek' hereinafter) (U.S. Patent 6,728,690).

With respect to claim 24, Yamanishi fails to expressly disclose the limitation of receiving 
a given mis-characterizing cost data associated with at least one of said synthesized samples, 
representing a cost of said outlier detection algorithm mis-characterizing said at least one
synthesized sample as a non-outlier. 

Meek, however, teaches this limitation at column 11, line 65 to column 12, line 6. Therein
Meek expresses concern for the cost of making a mistake in classification (i.e., mistaking real
email for junk email) in providing excellent generalization performance (col. 12, lines 7-10).

Accordingly, it would have been obvious to one of ordinary skill in the data processing 
art at the time of the present invention to combine the teachings of the cited references because 
Meek's teachings would have given to Yamanishi's invention better indications of categories 
(i.e. positive or negative categories). Such categorizing is disclosed by Yamanishi as labeling 
(paragraph 0065, Yamanishi). 

With respect to claim 25, Yamanishi fails to disclose constructing said learning data set 
from said candidate sample data is further based on said mis-characterizing cost. 
Meek, however, teaches this limitation in column 12, lines 1-10.

With respect to claim 27 and similar claims 32, 35, 40, 45 and 48, Yamanishi fails to 
teach calculating a sampling probability value includes identifying a consistency within said set
of classification results and calculating a probability of said identified consistency using a
Gaussian probability function.

Meek, however, teaches this limitation in column 2, lines 50-65 for a classifier function. 

Accordingly, it would have been obvious to one of ordinary skill in the data processing 
art at the time of the present invention to combine the teachings of the cited references because 
Meek's teachings would have given to Yamanishi's invention better indications of categories 
(i.e. positive or negative categories) and a classifier function (col. 2 lines 35-67). Such 
categorizing is disclosed by Yamanishi as labeling (paragraph 0065, Yamanishi). 

With respect to claim 37, as these limitations are similar to those of claim 24, the same 
supporting rationale is used in the rejection of this claim. Accordingly the motivation for this 
rejection is the same as that of claim 24. 

With respect to claim 38, Yamanishi fails to teach constructing said learning data set 
from said candidate sample data include instructions for constructing said learning data based, at 
least in part, on said mis-characterizing cost. 

Meek, however, teaches this limitation in column 13, lines 53-64.

Response to Arguments

Applicant's arguments with respect to claims 23-48 have been considered but are moot in 
view of the new ground(s) of rejection. The Examiner submits that claims 23-48 are taught 
accordingly by the references above. 

The Examiner would also like to note that although the Applicant's remarks on pages 12-14
are directed towards cancelled claim 19, the Examiner understands the arguments to be intended
for claim 23, similar claim 36, and their dependent claims. Accordingly, the Examiner has
addressed these claims in the rejection above.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this
Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). 
Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE
MONTHS from the mailing date of this action. In the event a first reply is filed within TWO 
MONTHS of the mailing date of this final action and the advisory action is not mailed until after 
the end of the THREE-MONTH shortened statutory period, then the shortened statutory period 
will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 
CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, 
however, will the statutory period for reply expire later than SIX MONTHS from the date of this 
final action. 

Contact Information 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Robert M. Timblin whose telephone number is 571-272-5627. 
The examiner can normally be reached on M-F 8:00-4:30. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, John R. Cottingham, can be reached on 571-272-7079. The fax phone number for the
organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would 
like assistance from a USPTO Customer Service Representative or access to the automated 
information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 



Robert M. Timblin 




Patent Examiner AU 2167 
12/1/2006